The complete chloroplast genome sequence of Centaurea cyanus (Asteraceae)

Abstract Centaurea cyanus has long been a weed of farmland. In this study, the chloroplast genome of C. cyanus was sequenced to characterize its genomic features and to establish its phylogenetic relationship with related species. The chloroplast genome of C. cyanus is a circular molecule 152,433 bp in length, comprising a large single-copy (LSC) region of 83,464 bp, a small single-copy (SSC) region of 18,545 bp, and a pair of inverted repeat regions (IRs) of 25,212 bp each. The genome contains 130 genes, including 86 protein-coding genes, 36 tRNA genes, and eight rRNA genes. Phylogenetic analysis showed that C. cyanus is close to Carthamus tinctorius, Centaurea diffusa, and Centaurea maculosa, all of which fall in one clade. This study provides genetic resource information for the further study of Centaurea.


Introduction
Centaurea cyanus Linnaeus 1753, also known as 'blue cornflower' or 'bachelor's button', belongs to the family Asteraceae (Figure 1). C. cyanus is an annual or biennial plant, 30-70 cm tall or taller. It has grey-green branched stems, a tap root system, and lanceolate leaves, 1-4 cm long, arranged alternately on the stem. Its flowerheads are 1.5-3 cm in diameter; the capitula consist of deep blue sterile ray florets and less showy fertile disk florets, with a single ovule in each ovary (Tomar 2017; Haratym et al. 2020). It is native to Europe and was introduced to North America, where it is considered an invasive species. It appears in fields in autumn and spring and often infests winter crops. It has long been a weed of farmland, growing mainly in corn fields or along field edges (Haratym et al. 2020). Although it is a weed, C. cyanus is a popular garden plant because of its unique blue color. In addition, C. cyanus is widely used in medicine. Its flowers are listed as a diuretic in the current pharmacopoeia of the Russian Federation (Shikov et al. 2017) and are also used as diuretics and supplements in Scottish medicine (Bouafia et al. 2020). In European phytotherapy, the flowers are used to treat minor ocular inflammation (Garbacki et al. 1999). Polysaccharides extracted from its flower heads showed anti-inflammatory properties in a previous pharmacological experiment (Pirvu et al. 2012). Sesquiterpenes are the main compounds of the essential oil extracted from the aerial parts of C. cyanus; they have the potential to treat cancer and cardiovascular diseases, may help prevent neurodegeneration and treat burns, and are therefore used in pharmaceutical preparations (Chadwick et al. 2013). C. cyanus is also of great significance in the cosmetics industry. Aromatic acids extracted from C. cyanus are among the most common raw materials in cosmetics production (Pirvu et al. 2012). C. cyanus has further uses that are beyond the scope of this report. Although C. cyanus has been utilized in many fields, its phylogeny has not been fully resolved, and the structure and features of its chloroplast genome are still unknown.
In this study, the chloroplast genome of C. cyanus was sequenced and characterized, and a phylogenetic analysis was performed to clarify its phylogenetic position and to provide information for further phylogenetic studies.

Methods
Total DNA was extracted from leaves stored in liquid nitrogen using a DNeasy plant tissue kit (TIANGEN Biotech Co., Ltd., Beijing). The library was then constructed and sequenced on the Illumina HiSeq 2500 platform (Shanghai Personalbio Technology Co., Ltd., China). After low-quality reads were filtered out with fastp (Chen et al. 2018), 67,498,968 reads were retained. The C. cyanus chloroplast genome was assembled de novo using GetOrganelle v1.7.5 (Jin et al. 2020); assembly details are shown in Figure S1. The genome was annotated with CPGAVAS2 (Shi et al. 2019), and the genome map was drawn using CPGView (Liu et al. 2023). Phylogenetic analysis proceeded as follows. First, a total of 51 chloroplast genomes were downloaded from GenBank, and the 73 protein-coding genes shared by all genomes were selected. Each gene was then aligned separately with MAFFT v7.313 (Rozewicki et al. 2019) and masked with Gblocks 0.91b, and the masked alignments of all genes were concatenated end to end to form one supergene per species (Guo et al. 2022). Maximum likelihood phylogenies were inferred using IQ-TREE (Nguyen et al. 2015) under the TVM+F+I+G4 model with 5000 ultrafast bootstrap replicates, as well as the Shimodaira-Hasegawa-like approximate likelihood-ratio test.
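The end-to-end concatenation step can be illustrated with a minimal Python sketch. It is not the script used in this study: the function name `concatenate_alignments`, the taxon labels, and the toy sequences are hypothetical, and the sketch assumes that each per-gene alignment has already been aligned and masked and is held in memory as a mapping from taxon name to aligned sequence.

```python
from typing import Dict, List


def concatenate_alignments(gene_alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-gene alignments end to end, giving one supergene per taxon.

    Each element of gene_alignments maps taxon name -> aligned sequence for
    one gene. Only taxa present in every gene are kept.
    """
    if not gene_alignments:
        return {}

    # Taxa shared by all genes.
    shared = set(gene_alignments[0])
    for alignment in gene_alignments[1:]:
        shared &= set(alignment)

    parts: Dict[str, List[str]] = {taxon: [] for taxon in sorted(shared)}
    for alignment in gene_alignments:
        # Within one gene, every aligned sequence must have the same length.
        lengths = {len(alignment[taxon]) for taxon in shared}
        if len(lengths) > 1:
            raise ValueError("sequences within one gene alignment differ in length")
        for taxon in parts:
            parts[taxon].append(alignment[taxon])

    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Hypothetical toy data: two short "genes" for two taxa.
toy_genes = [
    {"C_cyanus": "ATG-CA", "C_diffusa": "ATGACA"},
    {"C_cyanus": "TTGA", "C_diffusa": "TT-A"},
]
supergenes = concatenate_alignments(toy_genes)
```

Restricting the result to shared taxa and checking lengths gene by gene keeps every supergene the same length, which tree-inference programs require of an input alignment.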

Results
The chloroplast genome of Centaurea cyanus is a circular molecule (Figure 2) 152,433 bp in length, consisting of four parts: a large single-copy (LSC) region of 83,464 bp, a small single-copy (SSC) region of 18,545 bp, and two inverted repeat regions (IRs) of 25,212 bp each. The G+C content was 37.76% for the whole chloroplast genome and 43.14% for the IRs, which was higher than that of the LSC and SSC regions (35.93% and 31.38%, respectively). The genome contains 130 genes, including 86 protein-coding genes, 36 tRNA genes, and eight rRNA genes; the structures of the cis-splicing and trans-splicing genes are shown in supplemental Figure S2. A maximum-likelihood (ML) tree was constructed (Figure 3) to show the phylogenetic placement of C. cyanus. C. cyanus is close to Carthamus tinctorius, Centaurea diffusa, and Centaurea maculosa, and all of them fall in one clade with high support, which is consistent with a previous study (Garcia-Jacas et al. 2001). However, Carthamus tinctorius also falls within the Centaurea clade, which requires further study.
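The reported figures can be cross-checked with simple arithmetic. The Python sketch below is only a consistency check on the numbers given in this section, not part of the original analysis; the variable names and the helper `gc_percent` are hypothetical. It confirms that the four regions sum to the total length, that the gene categories sum to 130, and that the length-weighted regional G+C contents agree with the whole-genome value to within rounding.

```python
# Region lengths (bp) and G+C contents (%) as reported in the text.
LSC_LEN, SSC_LEN, IR_LEN = 83_464, 18_545, 25_212
TOTAL_LEN = 152_433
GC_LSC, GC_SSC, GC_IR = 35.93, 31.38, 43.14
GC_WHOLE = 37.76


def gc_percent(seq: str) -> float:
    """G+C percentage of a sequence, counting only A, C, G and T."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / called


# Quadripartite structure: LSC + SSC + two copies of the IR.
total_from_regions = LSC_LEN + SSC_LEN + 2 * IR_LEN

# Gene categories: protein-coding + tRNA + rRNA.
gene_total = 86 + 36 + 8

# Length-weighted mean of the regional G+C contents.
weighted_gc = (
    LSC_LEN * GC_LSC + SSC_LEN * GC_SSC + 2 * IR_LEN * GC_IR
) / total_from_regions

print(total_from_regions == TOTAL_LEN, gene_total, round(weighted_gc, 2))
```

The IR length is counted twice because the genome carries two copies; the same applies when weighting the G+C content.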

Discussion and conclusion
This study reports the chloroplast genome of Centaurea cyanus for the first time. The phylogenetic result is broadly consistent with a previous study (Garcia-Jacas et al. 2001). However, Carthamus tinctorius also falls within the Centaurea clade, which is notable because the circumscription of Centaurea is still unclear (Garcia-Jacas et al. 2001). Moreover, only a few chloroplast genomes of the genus Centaurea have been reported. Further research is therefore needed to understand the phylogenetic position of Centaurea. This study provides genetic resource information for the further study of Centaurea.

Author contributions
Peng Xie and Yun Wang designed the study; Ningyun Zhang, Kerui Huang, Hanbin Yin, Peng Xie, and Ping Mo collected the samples and interpreted the data; Ningyun Zhang drafted the manuscript; Yun Wang and Peng Xie revised the manuscript. All authors read and approved the final version of the manuscript.

Ethical approval
Field studies complied with local legislation, and appropriate permissions were granted before the samples were collected from the Changde Vocational Technical College.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Data availability statement
The complete chloroplast genome sequence of Centaurea cyanus has been deposited in the GenBank database under the accession number NC_066898 or OP161554 (these numbers were automatically generated by NCBI and refer to the same sample). The associated BioProject, SRA, and BioSample numbers are PRJNA891141, SRR21929279, and SAMN31311817, respectively.